Overview
Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 891 |
| Missing cells | 537 |
| Missing cells (%) | 4.0% |
| Duplicate rows | 1 |
| Duplicate rows (%) | 0.1% |
| Total size in memory | 101.1 KiB |
| Average record size in memory | 116.1 B |
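The percentages in this table can be recomputed from the raw counts. The sketch below is a sanity check only, assuming the missing-cell percentage is taken over all cells (observations × variables); the helper name `pct` is illustrative and not part of ydata-profiling.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 15
n_observations = 891
missing_cells = 537
duplicate_rows = 1


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_cells = n_variables * n_observations           # every cell in the table
missing_pct = pct(missing_cells, total_cells)         # share of cells that are missing
duplicate_pct = pct(duplicate_rows, n_observations)   # share of rows flagged as duplicates

print(total_cells, missing_pct, duplicate_pct)
```

Under that assumption the recomputed shares agree with the 4.0% and 0.1% reported above.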
Variable types
| Numeric | 6 |
|---|---|
| Categorical | 5 |
| Unsupported | 4 |
| Dataset has 1 (0.1%) duplicate rows | Duplicates |
|---|---|
| Age is highly overall correlated with FamilySize and 5 other fields | High correlation |
| FamilySize is highly overall correlated with Age and 5 other fields | High correlation |
| Fare is highly overall correlated with Age and 2 other fields | High correlation |
| Has_Cabin is highly overall correlated with Age and 6 other fields | High correlation |
| Parch is highly overall correlated with Has_Cabin and 1 other field | High correlation |
| PassengerId is highly overall correlated with Age and 5 other fields | High correlation |
| Pclass is highly overall correlated with Has_Cabin and 2 other fields | High correlation |
| SibSp is highly overall correlated with isAlone | High correlation |
| Survived is highly overall correlated with Age and 5 other fields | High correlation |
| Title is highly overall correlated with isAlone | High correlation |
| isAlone is highly overall correlated with Age and 8 other fields | High correlation |
| Title has 537 (60.3%) missing values | Missing |
| Name is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| Sex is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| Ticket is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| Embarked is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| SibSp has 71 (8.0%) zeros | Zeros |
| Parch has 141 (15.8%) zeros | Zeros |
Reproduction
| Analysis started | 2025-10-15 05:37:18.226862 |
|---|---|
| Analysis finished | 2025-10-15 05:37:31.545081 |
| Duration | 13.32 seconds |
| Software version | ydata-profiling v4.17.0 |
| Download configuration | config.json |
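The reported duration can be checked against the two timestamps. This is a small sketch using only the standard library; the format string is an assumption about how the timestamps were written.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table above.
FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed timestamp layout
started = datetime.strptime("2025-10-15 05:37:18.226862", FMT)
finished = datetime.strptime("2025-10-15 05:37:31.545081", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```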
Variables
PassengerId
Real number (ℝ)
High correlation
| Distinct | 354 |
|---|---|
| Distinct (%) | 39.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 170.56902 |
| Minimum | 1 |
| Maximum | 889 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 312.5 |
| 95-th percentile | 765 |
| Maximum | 889 |
| Range | 888 |
| Interquartile range (IQR) | 311.5 |
Descriptive statistics
| Standard deviation | 265.13839 |
|---|---|
| Coefficient of variation (CV) | 1.5544346 |
| Kurtosis | 0.3729066 |
| Mean | 170.56902 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.3406645 |
| Sum | 151977 |
| Variance | 70298.367 |
| Monotonicity | Not monotonic |
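Several entries in the quantile and descriptive tables above are derived from one another. The following sketch recomputes them from the reported PassengerId figures; it assumes the usual definitions (mean = sum / count, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1) and is not taken from the profiler's source.

```python
# Values copied from the PassengerId tables above.
n = 891
total = 151977
std = 265.13839
q1, q3 = 1, 312.5
minimum, maximum = 1, 889

mean = total / n                  # arithmetic mean
cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance from the standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

print(round(mean, 5), round(cv, 7), round(variance, 3), iqr, value_range)
```

The results agree with the reported values up to rounding of the inputs.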
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 538 | 60.4% |
| 646 | 1 | 0.1% |
| 588 | 1 | 0.1% |
| 586 | 1 | 0.1% |
| 582 | 1 | 0.1% |
| 581 | 1 | 0.1% |
| 579 | 1 | 0.1% |
| 578 | 1 | 0.1% |
| 572 | 1 | 0.1% |
| 568 | 1 | 0.1% |
| Other values (344) | 344 | 38.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 538 | 60.4% |
| 2 | 1 | 0.1% |
| 4 | 1 | 0.1% |
| 8 | 1 | 0.1% |
| 9 | 1 | 0.1% |
| 10 | 1 | 0.1% |
| 11 | 1 | 0.1% |
| 14 | 1 | 0.1% |
| 17 | 1 | 0.1% |
| 19 | 1 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 889 | 1 | 0.1% |
| 886 | 1 | 0.1% |
| 881 | 1 | 0.1% |
| 880 | 1 | 0.1% |
| 875 | 1 | 0.1% |
| 872 | 1 | 0.1% |
| 870 | 1 | 0.1% |
| 867 | 1 | 0.1% |
| 864 | 1 | 0.1% |
| 862 | 1 | 0.1% |
Survived
Categorical
High correlation
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 716 | 80.4% |
| 0 | 175 | 19.6% |
Pclass
Categorical
High correlation
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 644 | 72.3% |
| 3 | 167 | 18.7% |
| 2 | 80 | 9.0% |
Name
Unsupported
Rejected Unsupported
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
Sex
Unsupported
Rejected Unsupported
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
Age
Real number (ℝ)
High correlation
| Distinct | 70 |
|---|---|
| Distinct (%) | 7.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.175275 |
| Minimum | 0.42 |
| Maximum | 70 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.1 KiB |
Quantile statistics
| Minimum | 0.42 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 24 |
| 95-th percentile | 44 |
| Maximum | 70 |
| Range | 69.58 |
| Interquartile range (IQR) | 23 |
Descriptive statistics
| Standard deviation | 15.633329 |
|---|---|
| Coefficient of variation (CV) | 1.3989212 |
| Kurtosis | 0.64660117 |
| Mean | 11.175275 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.3274418 |
| Sum | 9957.17 |
| Variance | 244.40097 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 544 | 61.1% |
| 28 | 51 | 5.7% |
| 24 | 13 | 1.5% |
| 18 | 13 | 1.5% |
| 25 | 11 | 1.2% |
| 2 | 10 | 1.1% |
| 36 | 10 | 1.1% |
| 4 | 10 | 1.1% |
| 29 | 9 | 1.0% |
| 31 | 9 | 1.0% |
| Other values (60) | 211 | 23.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.42 | 1 | 0.1% |
| 0.67 | 1 | 0.1% |
| 0.75 | 2 | 0.2% |
| 0.83 | 2 | 0.2% |
| 0.92 | 1 | 0.1% |
| 1 | 544 | 61.1% |
| 2 | 10 | 1.1% |
| 3 | 6 | 0.7% |
| 4 | 10 | 1.1% |
| 5 | 3 | 0.3% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 70 | 1 | 0.1% |
| 65 | 1 | 0.1% |
| 64 | 1 | 0.1% |
| 63 | 1 | 0.1% |
| 60 | 3 | 0.3% |
| 58 | 2 | 0.2% |
| 56 | 1 | 0.1% |
| 54 | 5 | 0.6% |
| 53 | 1 | 0.1% |
| 52 | 3 | 0.3% |
SibSp
Real number (ℝ)
High correlation Zeros
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.1257015 |
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 71 |
| Zeros (%) | 8.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.90809186 |
|---|---|
| Coefficient of variation (CV) | 0.80668978 |
| Kurtosis | 27.456592 |
| Mean | 1.1257015 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.531717 |
| Sum | 1003 |
| Variance | 0.82463083 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 746 | 83.7% |
| 0 | 71 | 8.0% |
| 2 | 28 | 3.1% |
| 4 | 18 | 2.0% |
| 3 | 16 | 1.8% |
| 8 | 7 | 0.8% |
| 5 | 5 | 0.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 71 | 8.0% |
| 1 | 746 | 83.7% |
| 2 | 28 | 3.1% |
| 3 | 16 | 1.8% |
| 4 | 18 | 2.0% |
| 5 | 5 | 0.6% |
| 8 | 7 | 0.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 8 | 7 | 0.8% |
| 5 | 5 | 0.6% |
| 4 | 18 | 2.0% |
| 3 | 16 | 1.8% |
| 2 | 28 | 3.1% |
| 1 | 746 | 83.7% |
| 0 | 71 | 8.0% |
Parch
Real number (ℝ)
High correlation Zeros
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.98428732 |
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 141 |
| Zeros (%) | 15.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.6549552 |
|---|---|
| Coefficient of variation (CV) | 0.66541059 |
| Kurtosis | 12.751808 |
| Mean | 0.98428732 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.2045182 |
| Sum | 877 |
| Variance | 0.42896632 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 655 | 73.5% |
| 0 | 141 | 15.8% |
| 2 | 80 | 9.0% |
| 5 | 5 | 0.6% |
| 3 | 5 | 0.6% |
| 4 | 4 | 0.4% |
| 6 | 1 | 0.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 141 | 15.8% |
| 1 | 655 | 73.5% |
| 2 | 80 | 9.0% |
| 3 | 5 | 0.6% |
| 4 | 4 | 0.4% |
| 5 | 5 | 0.6% |
| 6 | 1 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 6 | 1 | 0.1% |
| 5 | 5 | 0.6% |
| 4 | 4 | 0.4% |
| 3 | 5 | 0.6% |
| 2 | 80 | 9.0% |
| 1 | 655 | 73.5% |
| 0 | 141 | 15.8% |
Ticket
Unsupported
Rejected Unsupported
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
Fare
Real number (ℝ)
High correlation
| Distinct | 142 |
|---|---|
| Distinct (%) | 15.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20.004069 |
| Minimum | 1 |
| Maximum | 512.3292 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 23.45 |
| 95-th percentile | 89.1042 |
| Maximum | 512.3292 |
| Range | 511.3292 |
| Interquartile range (IQR) | 22.45 |
Descriptive statistics
| Standard deviation | 41.972976 |
|---|---|
| Coefficient of variation (CV) | 2.0982219 |
| Kurtosis | 31.886125 |
| Mean | 20.004069 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.5877342 |
| Sum | 17823.625 |
| Variance | 1761.7307 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 537 | 60.3% |
| 26 | 22 | 2.5% |
| 14.4542 | 7 | 0.8% |
| 31.275 | 7 | 0.8% |
| 16.1 | 7 | 0.8% |
| 69.55 | 7 | 0.8% |
| 15.5 | 7 | 0.8% |
| 26.25 | 6 | 0.7% |
| 24.15 | 6 | 0.7% |
| 46.9 | 6 | 0.7% |
| Other values (132) | 279 | 31.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 537 | 60.3% |
| 6.4958 | 1 | 0.1% |
| 7.0458 | 1 | 0.1% |
| 7.0542 | 1 | 0.1% |
| 7.2292 | 2 | 0.2% |
| 7.25 | 1 | 0.1% |
| 7.75 | 2 | 0.2% |
| 7.775 | 2 | 0.2% |
| 7.8542 | 3 | 0.3% |
| 7.925 | 5 | 0.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 512.3292 | 1 | 0.1% |
| 263 | 4 | 0.4% |
| 262.375 | 2 | 0.2% |
| 247.5208 | 2 | 0.2% |
| 227.525 | 1 | 0.1% |
| 211.5 | 1 | 0.1% |
| 211.3375 | 2 | 0.2% |
| 164.8667 | 2 | 0.2% |
| 153.4625 | 2 | 0.2% |
| 151.55 | 3 | 0.3% |
Embarked
Unsupported
Rejected Unsupported
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
Has_Cabin
Categorical
High correlation
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 647 | 72.6% |
| 0 | 244 | 27.4% |
FamilySize
Real number (ℝ)
High correlation
| Distinct | 9 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.9046016 |
| Minimum | 1 |
| Maximum | 11 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 6 |
| Maximum | 11 |
| Range | 10 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.6134585 |
|---|---|
| Coefficient of variation (CV) | 0.84713704 |
| Kurtosis | 9.159666 |
| Mean | 1.9046016 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.7274415 |
| Sum | 1697 |
| Variance | 2.6032485 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 537 | 60.3% |
| 2 | 161 | 18.1% |
| 3 | 102 | 11.4% |
| 4 | 29 | 3.3% |
| 6 | 22 | 2.5% |
| 5 | 15 | 1.7% |
| 7 | 12 | 1.3% |
| 11 | 7 | 0.8% |
| 8 | 6 | 0.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 537 | 60.3% |
| 2 | 161 | 18.1% |
| 3 | 102 | 11.4% |
| 4 | 29 | 3.3% |
| 5 | 15 | 1.7% |
| 6 | 22 | 2.5% |
| 7 | 12 | 1.3% |
| 8 | 6 | 0.7% |
| 11 | 7 | 0.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 11 | 7 | 0.8% |
| 8 | 6 | 0.7% |
| 7 | 12 | 1.3% |
| 6 | 22 | 2.5% |
| 5 | 15 | 1.7% |
| 4 | 29 | 3.3% |
| 3 | 102 | 11.4% |
| 2 | 161 | 18.1% |
| 1 | 537 | 60.3% |
isAlone
Categorical
High correlation
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 537 | 60.3% |
| 0 | 354 | 39.7% |
Title
Categorical
High correlation Missing
| Distinct | 5 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 537 |
| Missing (%) | 60.3% |
| Memory size | 7.1 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 3.2514124 |
| Min length | 2 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Mr |
|---|---|
| 2nd row | Mrs |
| 3rd row | Mrs |
| 4th row | Master |
| 5th row | Mrs |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Mr | 120 | 13.5% |
| Mrs | 105 | 11.8% |
| Miss | 82 | 9.2% |
| Master | 40 | 4.5% |
| Rare | 7 | 0.8% |
| (Missing) | 537 | 60.3% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| mr | 120 | 33.9% |
| mrs | 105 | 29.7% |
| miss | 82 | 23.2% |
| master | 40 | 11.3% |
| rare | 7 | 2.0% |
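The two Title frequency tables appear to use different denominators: the Common Values table divides by all 891 rows (Master is 4.5%), while the plot table divides by the non-missing rows only (master is 11.3%). The sketch below reproduces both readings from the reported counts; the variable names are illustrative, and the interpretation is an inference from the numbers rather than something the report states.

```python
# Counts copied from the Title "Common Values" table above.
n_rows = 891
title_counts = {"Mr": 120, "Mrs": 105, "Miss": 82, "Master": 40, "Rare": 7}

n_present = sum(title_counts.values())   # rows with a Title
n_missing = n_rows - n_present           # rows without a Title

# Share of all rows (missing values included in the denominator).
share_of_all = {t: round(100 * c / n_rows, 1) for t, c in title_counts.items()}
# Share of non-missing rows only.
share_of_present = {t: round(100 * c / n_present, 1) for t, c in title_counts.items()}

print(n_present, n_missing)
print(share_of_all)
print(share_of_present)
```

The same non-missing denominator would also explain the Distinct (%) value of 1.4% (5 distinct titles out of 354 rows).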
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| M | 347 | 30.1% |
| s | 309 | 26.8% |
| r | 272 | 23.6% |
| i | 82 | 7.1% |
| a | 47 | 4.1% |
| e | 47 | 4.1% |
| t | 40 | 3.5% |
| R | 7 | 0.6% |
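The character counts above can be rebuilt from the Title value counts by weighting each character by how often its title occurs. This is a minimal sketch using the standard library; it illustrates the arithmetic and is not the profiler's own implementation.

```python
from collections import Counter

# Counts copied from the Title "Common Values" table above.
title_counts = {"Mr": 120, "Mrs": 105, "Miss": 82, "Master": 40, "Rare": 7}

# Each character of a title is counted once per row holding that title.
char_counts = Counter()
for title, count in title_counts.items():
    for ch in title:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(title_counts.values())  # characters per non-missing title

print(char_counts.most_common())
print(total_chars, round(mean_length, 7))
```

The total of 1151 characters and the mean length of about 3.25 match the figures reported for Title.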
Interactions
Correlations
| | Age | FamilySize | Fare | Has_Cabin | Parch | PassengerId | Pclass | SibSp | Survived | Title | isAlone |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.844 | 0.895 | 0.675 | -0.230 | 0.868 | 0.491 | -0.047 | 0.591 | 0.411 | 0.883 |
| FamilySize | 0.844 | 1.000 | 0.933 | 0.591 | 0.067 | 0.925 | 0.495 | 0.146 | 0.511 | 0.195 | 0.642 |
| Fare | 0.895 | 0.933 | 1.000 | 0.071 | -0.094 | 0.924 | 0.069 | 0.027 | 0.114 | 0.049 | 0.448 |
| Has_Cabin | 0.675 | 0.591 | 0.071 | 1.000 | 0.573 | 0.664 | 0.940 | 0.460 | 0.618 | 0.079 | 0.753 |
| Parch | -0.230 | 0.067 | -0.094 | 0.573 | 1.000 | -0.117 | 0.392 | -0.091 | 0.490 | 0.220 | 0.735 |
| PassengerId | 0.868 | 0.925 | 0.924 | 0.664 | -0.117 | 1.000 | 0.480 | -0.007 | 0.546 | 0.055 | 0.901 |
| Pclass | 0.491 | 0.495 | 0.069 | 0.940 | 0.392 | 0.480 | 1.000 | 0.387 | 0.649 | 0.172 | 0.762 |
| SibSp | -0.047 | 0.146 | 0.027 | 0.460 | -0.091 | -0.007 | 0.387 | 1.000 | 0.438 | 0.242 | 0.537 |
| Survived | 0.591 | 0.511 | 0.114 | 0.618 | 0.490 | 0.546 | 0.649 | 0.438 | 1.000 | 0.497 | 0.605 |
| Title | 0.411 | 0.195 | 0.049 | 0.079 | 0.220 | 0.055 | 0.172 | 0.242 | 0.497 | 1.000 | 1.000 |
| isAlone | 0.883 | 0.642 | 0.448 | 0.753 | 0.735 | 0.901 | 0.762 | 0.537 | 0.605 | 1.000 | 1.000 |
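The "High correlation" alerts can be related to this matrix by thresholding each row. The sketch below assumes a cut-off of 0.5 on the absolute coefficient; the report does not state that value, but it is consistent with the alerts listed in the overview. The function name `highly_correlated` is hypothetical.

```python
THRESHOLD = 0.5  # assumed cut-off; not stated in the report

# Column order of the correlation matrix above.
columns = ["Age", "FamilySize", "Fare", "Has_Cabin", "Parch", "PassengerId",
           "Pclass", "SibSp", "Survived", "Title", "isAlone"]

# Three rows copied from the matrix above.
rows = {
    "Fare":  [0.895, 0.933, 1.000, 0.071, -0.094, 0.924, 0.069, 0.027, 0.114, 0.049, 0.448],
    "Parch": [-0.230, 0.067, -0.094, 0.573, 1.000, -0.117, 0.392, -0.091, 0.490, 0.220, 0.735],
    "SibSp": [-0.047, 0.146, 0.027, 0.460, -0.091, -0.007, 0.387, 1.000, 0.438, 0.242, 0.537],
}


def highly_correlated(name, values, threshold=THRESHOLD):
    """List the other fields whose absolute coefficient reaches the threshold."""
    return [other for other, r in zip(columns, values)
            if other != name and abs(r) >= threshold]


for name, values in rows.items():
    print(name, "->", highly_correlated(name, values))
```

With this cut-off, SibSp is flagged only against isAlone, Parch against Has_Cabin and one other field, and Fare against Age and two other fields, matching the corresponding alerts.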
Missing values
Sample
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Embarked | Has_Cabin | FamilySize | isAlone | Title |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | S | 0 | 2 | 0 | Mr |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C | 1 | 2 | 0 | Mrs |
| 2 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.0000 | 1 | 1 | 1 | 1 | NaN |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | S | 1 | 2 | 0 | Mrs |
| 4 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.0000 | 1 | 1 | 1 | 1 | NaN |
| 5 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.0000 | 1 | 1 | 1 | 1 | NaN |
| 6 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.0000 | 1 | 1 | 1 | 1 | NaN |
| 7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.0 | 3 | 1 | 349909 | 21.0750 | S | 0 | 5 | 0 | Master |
| 8 | 9 | 1 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | female | 27.0 | 0 | 2 | 347742 | 11.1333 | S | 0 | 3 | 0 | Mrs |
| 9 | 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14.0 | 1 | 0 | 237736 | 30.0708 | C | 0 | 2 | 0 | Mrs |
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Embarked | Has_Cabin | FamilySize | isAlone | Title |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 881 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.000 | 1 | 1 | 1 | 1 | NaN |
| 882 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.000 | 1 | 1 | 1 | 1 | NaN |
| 883 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.000 | 1 | 1 | 1 | 1 | NaN |
| 884 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.000 | 1 | 1 | 1 | 1 | NaN |
| 885 | 886 | 0 | 3 | Rice, Mrs. William (Margaret Norton) | female | 39.0 | 0 | 5 | 382652 | 29.125 | Q | 0 | 6 | 0 | Mrs |
| 886 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.000 | 1 | 1 | 1 | 1 | NaN |
| 887 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.000 | 1 | 1 | 1 | 1 | NaN |
| 888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | 28.0 | 1 | 2 | W./C. 6607 | 23.450 | S | 0 | 4 | 0 | Miss |
| 889 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.000 | 1 | 1 | 1 | 1 | NaN |
| 890 | 1 | 1 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1 | 1.000 | 1 | 1 | 1 | 1 | NaN |
Duplicate rows
Most frequently occurring
| | PassengerId | Survived | Pclass | Age | SibSp | Parch | Fare | Has_Cabin | FamilySize | isAlone | Title | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 1.0 | 1 | 1 | 1.0 | 1 | 1 | 1 | NaN | 537 |
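One way to read this table is that a single row pattern repeats many times, which is why the overview counts one duplicate row while the "# duplicates" column shows 537. The sketch below shows how repeated rows could be tallied with the standard library. The rows are hypothetical toy data, not taken from the dataset, and `None` stands in for a missing value.

```python
from collections import Counter

# Hypothetical toy rows (PassengerId, Survived, Age, Title); None marks a missing Title.
rows = [
    (1, 1, 1.0, None),
    (1, 1, 1.0, None),
    (2, 0, 22.0, "Mr"),
    (1, 1, 1.0, None),
    (3, 1, 38.0, "Mrs"),
]

# Count how often each full row occurs, then keep the patterns seen more than once.
counts = Counter(rows)
duplicated = {row: c for row, c in counts.items() if c > 1}

print(len(duplicated), "distinct duplicated pattern(s)")
for row, c in duplicated.items():
    print(row, "occurs", c, "times")
```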